13 research outputs found

    Representation learning on relational data

    Get PDF
    Humans utilize information about relationships or interactions between objects for orientation in various situations. For example, we trust our friend circle recommendations, become friends with the people we already have shared friends with, or adapt opinions as a result of interactions with other people. In many Machine Learning applications, we also know about relationships, which bear essential information for the use-case. Recommendations in social media, scene understanding in computer vision, or traffic prediction are few examples where relationships play a crucial role in the application. In this thesis, we introduce methods taking relationships into account and demonstrate their benefits for various problems. A large number of problems, where relationship information plays a central role, can be approached by modeling data by a graph structure and by task formulation as a prediction problem on the graph. In the first part of the thesis, we tackle the problem of node classification from various directions. We start with unsupervised learning approaches, which differ by assumptions they make about the relationship's meaning in the graph. For some applications such as social networks, it is a feasible assumption that densely connected nodes are similar. On the other hand, if we want to predict passenger traffic for the airport based on its flight connections, similar nodes are not necessarily positioned close to each other in the graph and more likely have comparable neighborhood patterns. Furthermore, we introduce novel methods for classification and regression in a semi-supervised setting, where labels of interest are known for a fraction of nodes. We use the known prediction targets and information about how nodes connect to learn the relationships' meaning and their effect on the final prediction. In the second part of the thesis, we deal with the problem of graph matching. Our first use-case is the alignment of different geographical maps, where the focus lies on the real-life setting. We introduce a robust method that can learn to ignore the noise in the data. Next, our focus moves to the field of Entity Alignment in different Knowledge Graphs. We analyze the process of manual data annotation and propose a setting and algorithms to accelerate this labor-intensive process. Furthermore, we point to the several shortcomings in the empirical evaluations, make several suggestions on how to improve it, and extensively analyze existing approaches for the task. The next part of the thesis is dedicated to the research direction dealing with automatic extraction and search of arguments, known as Argument Mining. We propose a novel approach for identifying arguments and demonstrate how it can make use of relational information. We apply our method to identify arguments in peer-reviews for scientific publications and show that arguments are essential for the decision process. Furthermore, we address the problem of argument search and introduce a novel approach that retrieves relevant and original arguments for the user's queries. Finally, we propose an approach for subspace clustering, which can deal with large datasets and assign new objects to the found clusters. Our method learns the relationships between objects and performs the clustering on the resulting graph

    LASAGNE: Locality And Structure Aware Graph Node Embedding

    Full text link
    In this work we propose Lasagne, a methodology to learn locality and structure aware graph node embeddings in an unsupervised way. In particular, we show that the performance of existing random-walk based approaches depends strongly on the structural properties of the graph, e.g., the size of the graph, whether the graph has a flat or upward-sloping Network Community Profile (NCP), whether the graph is expander-like, whether the classes of interest are more k-core-like or more peripheral, etc. For larger graphs with flat NCPs that are strongly expander-like, existing methods lead to random walks that expand rapidly, touching many dissimilar nodes, thereby leading to lower-quality vector representations that are less useful for downstream tasks. Rather than relying on global random walks or neighbors within fixed hop distances, Lasagne exploits strongly local Approximate Personalized PageRank stationary distributions to more precisely engineer local information into node embeddings. This leads, in particular, to more meaningful and more useful vector representations of nodes in poorly-structured graphs. We show that Lasagne leads to significant improvement in downstream multi-label classification for larger graphs with flat NCPs, that it is comparable for smaller graphs with upward-sloping NCPs, and that is comparable to existing methods for link prediction tasks

    On the Ambiguity of Rank-Based Evaluation of Entity Alignment or Link Prediction Methods

    Full text link
    In this work, we take a closer look at the evaluation of two families of methods for enriching information from knowledge graphs: Link Prediction and Entity Alignment. In the current experimental setting, multiple different scores are employed to assess different aspects of model performance. We analyze the informativeness of these evaluation measures and identify several shortcomings. In particular, we demonstrate that all existing scores can hardly be used to compare results across different datasets. Moreover, we demonstrate that varying size of the test size automatically has impact on the performance of the same model based on commonly used metrics for the Entity Alignment task. We show that this leads to various problems in the interpretation of results, which may support misleading conclusions. Therefore, we propose adjustments to the evaluation and demonstrate empirically how this supports a fair, comparable, and interpretable assessment of model performance. Our code is available at https://github.com/mberr/rank-based-evaluation
    corecore